RIFT: A Scalable Methodology for LLM Accelerator Fault Assessment using Reinforcement Learning
Khalil, Khurram, Khaliq, Muhammad Mahad, Hoque, Khaza Anuarul
Abstract--The massive scale of modern AI accelerators presents critical challenges to traditional fault assessment methodologies, which face prohibitive computational costs and provide poor coverage of critical failure modes. This paper introduces RIFT (Reinforcement Learning-guided Intelligent Fault Targeting), a scalable framework that automates the discovery of minimal, high-impact fault scenarios for efficient design-time fault assessment. RIFT transforms the complex search for worst-case faults into a sequential decision-making problem, combining hybrid sensitivity analysis for search-space pruning with reinforcement learning to intelligently generate minimal, high-impact test suites. Evaluated on billion-parameter Large Language Model (LLM) workloads using NVIDIA A100 GPUs, RIFT achieves a 2.2× fault assessment speedup over evolutionary methods and reduces the required test vector volume by over 99% compared to random fault injection, all while achieving superior fault coverage. The proposed framework also provides actionable data to enable intelligent hardware protection strategies, demonstrating that RIFT-guided selective error correction code provides a 12.8× improvement in cost-effectiveness (coverage per unit area) compared to uniform triple modular redundancy protection. RIFT automatically generates UVM-compliant verification artifacts, ensuring its findings are directly actionable and integrable into commercial RTL verification workflows.
The recent advent of Large Language Models (LLMs) with hundreds of billions of parameters has had a transformative impact on computing, but has also introduced unprecedented computational demands [1].
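The abstract's core idea, pruning the fault space with sensitivity scores and then letting an RL agent learn which fault sites matter, can be caricatured in a few lines. Everything below is an illustrative stand-in: the layer names, sensitivity scores, threshold, and bandit-style reward are invented, not taken from RIFT's actual implementation.

```python
import random

# Hypothetical per-site sensitivity scores, standing in for the paper's
# hybrid sensitivity analysis; the names and values are made up.
sensitivity = {"attn.0": 0.9, "mlp.0": 0.7, "attn.5": 0.2, "mlp.9": 0.1}

# Search-space pruning: keep only sites above a sensitivity threshold.
candidates = [s for s, v in sensitivity.items() if v >= 0.5]

def impact(site):
    # Stand-in for running a fault-injection campaign and measuring
    # output degradation; here we simply reuse the sensitivity score.
    return sensitivity[site]

# Tabular, bandit-style Q-learning over which fault site to target next.
q = {s: 0.0 for s in candidates}
alpha, epsilon = 0.5, 0.1
random.seed(0)
for episode in range(200):
    if random.random() < epsilon:
        site = random.choice(candidates)   # explore
    else:
        site = max(q, key=q.get)           # exploit current estimate
    reward = impact(site)
    q[site] += alpha * (reward - q[site])  # one-step value update

best = max(q, key=q.get)
print(best, round(q[best], 3))
```

With the toy reward above, the Q-values simply converge toward each site's impact score, so the agent concentrates its test budget on the highest-impact site, which is the intuition behind generating a minimal, high-impact test suite.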
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.68)
FlipLLM: Efficient Bit-Flip Attacks on Multimodal LLMs using Reinforcement Learning
Khalil, Khurram, Hoque, Khaza Anuarul
Abstract--Generative Artificial Intelligence models like Large Language Models (LLMs) and Vision Language Models (VLMs) exhibit state-of-the-art performance across a wide range of tasks but remain vulnerable to hardware-based threats, specifically bit-flip attacks (BFAs), posing a serious risk to their security in safety-critical applications. Existing BFA discovery methods--gradient-based, static analysis, and search-based--lack generalizability and struggle to scale, often failing to analyze the vast parameter space and complex interdependencies of modern foundation models in a reasonable time. This paper proposes FlipLLM, a reinforcement learning (RL)-based, architecture-agnostic framework that formulates BFA discovery as a sequential decision-making problem. FlipLLM combines sensitivity-guided layer pruning with Q-learning to efficiently identify minimal, high-impact bit sets capable of inducing catastrophic failure. We demonstrate the effectiveness and generalizability of FlipLLM by applying it to a diverse set of models, including prominent text-only LLMs (GPT-2 Large, LLaMA 3.1 8B, and DeepSeek-V2 7B) and VLMs such as LLaVA 1.6, and datasets such as MMLU, MMLU-Pro, VQA v2, and TextVQA. Our results show that FlipLLM can identify critical bits that are vulnerable to BFAs up to 2.5× faster than SOTA methods. We demonstrate that flipping the FlipLLM-identified bits causes the accuracy of LLaMA 3.1 8B to plummet from 69.9% to 0.2%, and LLaVA's VQA score to drop from 78% to almost 0%, by flipping as few as 5 and 7 bits, respectively. Further analysis shows that applying standard hardware protection mechanisms, such as ECC SECDED, to the FlipLLM-identified bit locations completely mitigates the BFA impact, demonstrating the practical value of our framework for guiding hardware-level defenses.
FlipLLM offers the first scalable and adaptive methodology for exploring the BFA vulnerability of both language and multimodal foundation models, paving the way for comprehensive hardware-security evaluation.
Generative Artificial Intelligence models like Large Language Models (LLMs) [1] and Vision Language Models (VLMs) represent a transformative advancement in artificial intelligence, finding integration into mission-critical systems spanning healthcare, finance, and autonomous navigation [2], [3]. Their effective deployment mandates reliable and secure operation across diverse hardware infrastructures, from expansive cloud accelerators to resource-constrained edge devices.
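The attack primitive behind a BFA is a single-bit corruption of a stored weight. A minimal sketch of that primitive (the weight value and bit position below are illustrative; FlipLLM's contribution is *finding which bits* to flip, not the flip itself):

```python
import struct

def flip_bit(value: float, bit: int) -> float:
    """Flip one bit of a value's IEEE-754 float32 encoding."""
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    return struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))[0]

w = 0.5
# Bit 30 is the top exponent bit of a float32; flipping it turns a small
# weight into an astronomically large one -- the kind of single-bit
# corruption that makes a handful of flips catastrophic for a model.
corrupted = flip_bit(w, 30)
print(w, "->", corrupted)
```

Flipping the same bit again restores the original value, which is also why targeted ECC SECDED protection of the identified locations, as the abstract reports, fully neutralises the attack: a single corrected bit per word is exactly the fault class involved.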
- North America > United States > Missouri > Boone County > Columbia (0.04)
- Europe > Netherlands (0.04)
- Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.54)
Modeling Spatio-temporal Extremes via Conditional Variational Autoencoders
Ma, Xiaoyu, Zhang, Likun, Wikle, Christopher K.
Extreme weather events are widely studied in fields such as agriculture, ecology, and meteorology. The spatio-temporal co-occurrence of extreme events can strengthen or weaken under changing climate conditions. In this paper, we propose a novel approach to model spatio-temporal extremes by integrating climate indices via a conditional variational autoencoder (cXVAE). A convolutional neural network (CNN) is embedded in the decoder to convolve climatological indices with the spatial dependence within the latent space, thereby allowing the decoder to depend on the climate variables. There are three main contributions here. First, we demonstrate through extensive simulations that the proposed conditional XVAE accurately emulates spatial fields and recovers spatially and temporally varying extremal dependence with very low computational cost post training. Second, we provide a simple, scalable approach to detecting condition-driven shifts and whether the dependence structure is invariant to the conditioning variable. Third, when dependence is found to be condition-sensitive, the conditional XVAE supports counterfactual experiments, allowing one to intervene on the climate covariate and propagate the associated change through the learned decoder to quantify differences in joint tail risk, co-occurrence ranges, and return metrics. To demonstrate the practical utility and performance of the model in real-world scenarios, we apply our method to analyze the monthly maximum Fire Weather Index (FWI) over eastern Australia from 2014 to 2024 conditioned on the El Niño/Southern Oscillation (ENSO) index.
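The counterfactual mechanism described above, holding the latent state fixed while intervening on the climate covariate, can be sketched with a toy linear decoder. The field size, weights, and covariate values below are invented stand-ins for the trained CNN decoder and the ENSO index:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the learned decoder: a latent vector z and a scalar
# climate covariate c (e.g. an ENSO index) map to a spatial field over
# n_sites locations. Weights are random here; in the paper they come
# from training the conditional XVAE.
n_sites, n_latent = 16, 4
W_z = rng.normal(size=(n_sites, n_latent))
w_c = rng.normal(size=n_sites)

def decode(z, c):
    # Conditioning: the covariate shifts the field site-by-site, so the
    # simulated joint behaviour depends on c.
    return W_z @ z + w_c * c

z = rng.normal(size=n_latent)
field_nino = decode(z, c=1.5)   # strong El Niño scenario
field_nina = decode(z, c=-1.5)  # strong La Niña scenario

# Counterfactual contrast: same latent state, intervened covariate.
delta = field_nino - field_nina
print(delta[:4])
```

With the same `z`, the difference between the two decoded fields isolates the effect of the intervention on `c`, which is the quantity one would then summarise into tail-risk or co-occurrence contrasts.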
- North America > United States > Missouri > Boone County > Columbia (0.14)
- Oceania > Australia > New South Wales (0.04)
- Oceania > Australia > Queensland (0.04)
- (2 more...)
Artificial Intelligence Competence of K-12 Students Shapes Their AI Risk Perception: A Co-occurrence Network Analysis
Heilala, Ville, Sikström, Pieta, Setälä, Mika, Kärkkäinen, Tommi
As artificial intelligence (AI) becomes increasingly integrated into education, understanding how students perceive its risks is essential for supporting responsible and effective adoption. This research aimed to examine the relationships between perceived AI competence and risks among Finnish K-12 upper secondary students (n = 163) by utilizing a co-occurrence analysis. Students reported their self-perceived AI competence and concerns related to AI across systemic, institutional, and personal domains. The findings showed that students with lower competence emphasized personal and learning-related risks, such as reduced creativity, lack of critical thinking, and misuse, whereas higher-competence students focused more on systemic and institutional risks, including bias, inaccuracy, and cheating. These differences suggest that students' self-reported AI competence is related to how they evaluate both the risks and opportunities associated with artificial intelligence in education (AIED). The results of this study highlight the need for educational institutions to incorporate AI literacy into their curricula, provide teacher guidance, and inform policy development to ensure personalized opportunities for utilization and equitable integration of AI into K-12 education.
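At its core, the co-occurrence analysis reduces to counting how often pairs of risk codes appear in the same student response and using those counts as network edge weights. A minimal sketch with invented responses (the actual codes are derived from the Finnish survey answers):

```python
from collections import Counter
from itertools import combinations

# Hypothetical coded responses: each set lists the risk codes assigned
# to one student's answer; these examples are illustrative only.
responses = [
    {"reduced creativity", "misuse"},
    {"bias", "inaccuracy", "cheating"},
    {"reduced creativity", "lack of critical thinking"},
    {"bias", "cheating"},
]

# Edge weight = number of students in whose response both codes co-occur.
cooccur = Counter()
for risks in responses:
    for a, b in combinations(sorted(risks), 2):
        cooccur[(a, b)] += 1

for pair, weight in cooccur.most_common(3):
    print(pair, weight)
```

Splitting the responses by self-reported competence level and building one such network per group is what lets the study contrast which risk clusters dominate for lower- versus higher-competence students.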
- North America > United States > District of Columbia > Washington (0.05)
- Europe > Finland > Central Finland > Jyväskylä (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
- Information Technology > Artificial Intelligence > Applied AI (0.95)
- Information Technology > Artificial Intelligence > Natural Language (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Health App Reviews for Privacy & Trust (HARPT): A Corpus for Analyzing Patient Privacy Concerns, Trust in Providers and Trust in Applications
Kelly, Timoteo, Korkmaz, Abdulkadir, Mallet, Samuel, Souders, Connor, Aliakbarpour, Sadra, Rao, Praveen
Background: User reviews of Telehealth and Patient Portal mobile applications (apps), hereafter referred to as electronic health (eHealth) apps, are a rich source of unsolicited patient feedback, revealing critical insights into patient perceptions. However, the lack of large-scale, annotated datasets specific to privacy and trust has limited the ability of researchers to systematically analyze these concerns using natural language processing (NLP) techniques. Objective: This study aims to develop and benchmark Health App Reviews for Privacy & Trust (HARPT), a large-scale annotated corpus of patient reviews from eHealth apps, to advance research in patient privacy and trust. Methods: We employed a multistage data construction strategy that integrated keyword-based filtering, iterative manual labeling with review, targeted data augmentation, and weak supervision using transformer-based classifiers. A curated subset of 7,000 reviews was manually annotated to support machine learning model development and evaluation. The resulting dataset was used to benchmark a broad range of models. Results: The HARPT corpus comprises 480,000 patient reviews annotated across seven categories capturing critical aspects of trust in the application (TA), trust in the provider (TP), and privacy concerns (PC). We provide comprehensive benchmark performance for a range of machine learning models on the manually annotated subset, establishing a baseline for future research. Conclusions: The HARPT corpus is a significant resource for advancing the study of privacy and trust in the eHealth domain. By providing a large-scale, annotated dataset and initial benchmarks, this work supports reproducible research in usable privacy and trust within health informatics. HARPT is released under an open resource license.
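The keyword-based filtering stage can be sketched as a coarse first-pass labeler that routes reviews toward the corpus's label families. The keyword lists and example reviews below are illustrative inventions, not drawn from the HARPT corpus, and only two of the label families (PC, TA) are shown:

```python
# Illustrative keyword lists; the real pipeline used curated lists plus
# iterative manual labeling and transformer-based weak supervision.
PRIVACY_KEYWORDS = {"privacy", "data", "tracking", "share my"}
TRUST_KEYWORDS = {"trust", "reliable", "secure"}

def candidate_labels(review: str) -> set:
    """First-pass candidate labels for one review (coarse keyword match)."""
    text = review.lower()
    labels = set()
    if any(k in text for k in PRIVACY_KEYWORDS):
        labels.add("PC")  # privacy concern
    if any(k in text for k in TRUST_KEYWORDS):
        labels.add("TA")  # trust in the application
    return labels

reviews = [
    "Why does this app share my data with third parties?",
    "Very reliable portal, I trust it with my records.",
    "Crashes every time I open it.",
]
print([candidate_labels(r) for r in reviews])
```

Reviews matching no keyword (like the third one) are exactly the cases the later manual-annotation and weak-supervision stages exist to recover, since keyword filtering alone has low recall.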
- North America > United States > Missouri > Boone County > Columbia (0.15)
- North America > United States > Connecticut > New Haven County > New Haven (0.04)
- North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
- Asia > Middle East > Republic of Türkiye (0.04)
- Information Technology (1.00)
- Health & Medicine > Health Care Technology > Telehealth (1.00)
Privacy Preserving In-Context-Learning Framework for Large Language Models
Bhusal, Bishnu, Acharya, Manoj, Kaur, Ramneet, Samplawski, Colin, Roy, Anirban, Cobb, Adam D., Chadha, Rohit, Jha, Susmit
Large language models (LLMs) have significantly transformed natural language understanding and generation, but they raise privacy concerns due to potential exposure of sensitive information. Studies have highlighted the risk of information leakage, where adversaries can extract sensitive information embedded in the prompts. In this work, we introduce a novel private prediction framework for generating high-quality synthetic text with strong privacy guarantees. Our approach leverages the Differential Privacy (DP) framework to ensure worst-case theoretical bounds on information leakage without requiring any fine-tuning of the underlying models. The proposed method performs inference on private records and aggregates the resulting per-token output distributions. This enables the generation of longer and coherent synthetic text while maintaining privacy guarantees. Additionally, we propose a simple blending operation that combines private and public inference to further enhance utility. Empirical evaluations demonstrate that our approach outperforms previous state-of-the-art methods on in-context-learning (ICL) tasks, making it a promising direction for privacy-preserving text generation while maintaining high utility. Our code is available at https://github.com/bhusalb/
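The core private-prediction step described above, run inference on each private record, aggregate the per-token output distributions, and release only a noised aggregate, can be sketched as follows. The vocabulary size, distributions, and noise scale are toy values, and the calibration of the noise scale to a formal (ε, δ) guarantee is omitted:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy per-record next-token distributions over a 5-token vocabulary; in
# the paper each row would come from running the LLM on one private record.
per_record = np.array([
    [0.70, 0.10, 0.10, 0.05, 0.05],
    [0.60, 0.20, 0.10, 0.05, 0.05],
    [0.65, 0.15, 0.10, 0.05, 0.05],
])

sigma = 0.05  # illustrative noise scale; (epsilon, delta) calibration omitted

# Gaussian-mechanism-style aggregation: average the distributions, add
# noise, clip, and renormalise before choosing the next token.
mean = per_record.mean(axis=0)
noisy = np.clip(mean + rng.normal(scale=sigma, size=mean.shape), 1e-9, None)
private_dist = noisy / noisy.sum()

next_token = int(np.argmax(private_dist))
print(next_token, private_dist.round(3))
```

Because each record contributes only 1/n of the averaged distribution, the sensitivity of the aggregate is bounded, which is what lets additive noise yield a worst-case leakage bound without fine-tuning the underlying model; the blending with public inference mentioned in the abstract would mix this distribution with a non-private one to recover utility.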
- North America > United States > Missouri > Boone County > Columbia (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > California > San Mateo County > Menlo Park (0.04)
Active Learning and Explainable AI for Multi-Objective Optimization of Spin Coated Polymers
Young, Brendan, Alvey, Brendan, Werbrouck, Andreas, Murphy, Will, Keller, James, Young, Matthias J., Maschmann, Matthew
Spin coating polymer thin films to achieve specific mechanical properties is inherently a multi-objective optimization problem. We present a framework that integrates an active Pareto front learning algorithm (PyePAL) with visualization and explainable AI techniques to optimize processing parameters. PyePAL uses Gaussian process models to predict objective values (hardness and elasticity) from the design variables (spin speed, dilution, and polymer mixture), guiding the adaptive selection of samples toward promising regions of the design space. To enable interpretable insights into the high-dimensional design space, we utilize UMAP (Uniform Manifold Approximation and Projection) for two-dimensional visualization of the Pareto front exploration. Additionally, we incorporate fuzzy linguistic summaries, which translate the learned relationships between process parameters and performance objectives into linguistic statements, thus enhancing the explainability and understanding of the optimization results. Experimental results demonstrate that our method efficiently identifies promising polymer designs, while the visual and linguistic explanations facilitate expert-driven analysis and knowledge discovery.
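Underlying PyePAL's adaptive sampling is the notion of Pareto optimality over the two maximised objectives. A minimal sketch of Pareto-front extraction on synthetic (hardness, elasticity) points, with the Gaussian-process modelling and adaptive selection omitted:

```python
# Synthetic (hardness, elasticity) measurements; both objectives maximised.
points = [
    (3.0, 1.0), (2.5, 2.0), (1.0, 3.0), (2.0, 1.5), (0.5, 0.5),
]

def is_dominated(p, others):
    # p is dominated if some other point is at least as good in both
    # objectives and strictly better in at least one.
    return any(q[0] >= p[0] and q[1] >= p[1] and q != p for q in others)

pareto = [p for p in points if not is_dominated(p, points)]
print(sorted(pareto))
```

PyePAL's contribution is doing this classification *before* exhaustive measurement, using GP uncertainty to decide which spin-coating recipe to try next; the UMAP projection and fuzzy linguistic summaries then explain where on this front each design sits.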
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- North America > United States > Missouri > Boone County > Columbia (0.13)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.13)
- Europe > Switzerland > Basel-City > Basel (0.04)
- (2 more...)
- Research Report (1.00)
- Workflow (0.67)
- North America > United States > Rhode Island > Providence County > Providence (0.14)
- North America > United States > Missouri > Boone County > Columbia (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- (7 more...)
- North America > United States > Minnesota (0.04)
- Europe > Switzerland > Basel-City > Basel (0.04)
- North America > United States > Missouri > Boone County > Columbia (0.04)
- (2 more...)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)